System and method for prioritizing data in a cache

ABSTRACT

Implementations described and claimed herein provide a system and methods for prioritizing data in a cache. In one implementation, a priority level, such as critical, high, and normal, is assigned to cached data. The priority level dictates how long the data is cached and consequently, the order in which the data is evicted from the cache memory. Data assigned a priority level of critical will be resident in cache memory unless heavy memory pressure causes the system to reclaim memory and all data assigned a priority state of high or normal has been evicted. High priority data is cached longer than normal priority data, with normal priority data being evicted first. Accordingly, important data assigned a priority level of critical, such as a deduplication table, is kept resident in cache memory at the expense of other data, regardless of the frequency or recency of use of the data.

TECHNICAL FIELD

Aspects of the present disclosure relate to data storage systems, and in particular, systems and methods for allocating and managing resources for a deduplication table and for assigning priorities to data stored in a cache.

BACKGROUND

As the demand for data storage continues to increase, larger and more sophisticated storage systems are being designed and deployed. Many large scale data storage systems utilize storage appliances that include arrays of storage media. Multiple storage appliances may be networked together to form a cluster, which allows for an increase in the volume of stored data. The increase in the number of components, the number of users, and the volume of data often results in disparate users creating separate but identical copies of data, leading to exponential growth in physical storage capacity. For example, multiple members of a business may use the same operating system or store the same document. In such cases, data deduplication technologies can significantly increase data storage efficiency and reduce cost. Data deduplication technologies remove redundancy from stored data by storing unique data a single time and subsequent, redundant copies of that data as indices in a deduplication table pointing to the unique data. As a result, data can be stored in a fraction of the physical space that would otherwise be required. For example, 100 copies of a 10 gigabyte (GB) operating system can be stored with 10 GB of physical capacity, and 1000 copies of the same 1 megabyte (MB) file can be stored with 1 MB of physical capacity.

Memory caching is widely used in data storage systems. Reading from and writing to cache memory is significantly faster than accessing other storage media, such as spinning media. Data deduplication involves performing a lookup into the deduplication table prior to writing data to determine if the data is a duplicate of existing data. As such, to perform deduplication efficiently and not impact system response time, many data storage systems store the deduplication table in cache memory, such as a dynamic random access memory (DRAM) based cache. However, cache memory remains significantly more expensive than other storage media. Consequently, cache memory is usually only a fraction of the size of other storage media in a data storage system.

In some cases, a deduplication table can grow unbounded, beyond the size of the available memory cache. While this allows for the deduplication of arbitrary amounts of data storage, portions of the deduplication table may be evicted from the memory cache as the size of the deduplication table grows. Specifically, if cache memory is full, existing data must be evicted from the cache memory before new data may be stored. Many caching systems and methods evict data based on algorithms that track recency (evicting data that has been least recently used), frequency (evicting data that has been least frequently used), or some combination of recency and frequency. However, such algorithms fail to identify the importance of data, resulting in important data that is not recently or frequently used, such as all or portions of the deduplication table, being evicted from the memory cache into other storage media, such as flash or spinning disks. When all or a portion of the deduplication table is stored in flash or disks, read and write request overhead is substantially increased, resulting in significantly reduced system performance.

It is with these observations in mind, among others, that various aspects of the present disclosure were conceived and developed.

SUMMARY

Implementations described and claimed herein address the foregoing problems by providing systems and methods for prioritizing data in a cache. In one implementation, a write request to write data to a cache is received. The cache has one or more states. A priority level is assigned to the data using at least one processor. The priority level and a cache replacement policy dictate an order in which the data is evicted from the cache. The data is written to the one or more states in the cache based on the assigned priority level.

Other implementations are also described and recited herein.

BRIEF DESCRIPTION OF THE DRAWINGS

Example implementations are illustrated in referenced figures of the drawings. It is intended that the implementations and figures disclosed herein are to be considered illustrative rather than limiting.

FIG. 1 is an example file system incorporating a deduplication table stored in cache memory;

FIG. 2 is a flow chart illustrating example operations for allocating and managing resources for a deduplication table;

FIG. 3 is an example cache system storing data according to assigned priorities;

FIG. 4 is a flow chart illustrating example operations for assigning priorities to data stored in a cache;

FIG. 5 is an example network environment that may implement various systems and methods of the presently disclosed technology; and

FIG. 6 is an example computing system that may implement various systems and methods of the presently disclosed technology.

DETAILED DESCRIPTION

Aspects of the present disclosure involve systems and methodologies to increase efficiency of access to important data, such as a deduplication table, by keeping such data resident in cache memory. In one aspect, an administrator has the option to allocate and manage cache memory for the deduplication table. In other words, the administrator sets the amount of cache memory designated to the deduplication table, putting an upper limit on the size of the deduplication table. If writing a new entry in the deduplication table will cause the size of the deduplication table to exceed the upper limit, deduplication of new unique blocks is turned off or otherwise prohibited. Specifically, new unique data will be stored, but an entry corresponding to the new unique data will not be created in the deduplication table. Accordingly, the size of the deduplication table is capped, ensuring that it will be resident in cache memory.

In another aspect, important data, such as the deduplication table, is kept resident in cache memory. A priority state, such as critical, high, and normal, is assigned to cached data. The priority state dictates how long the data is cached and consequently, the order in which the data is evicted from the cache memory. Data assigned a priority state of critical will be resident in cache memory unless heavy memory pressure causes the system to reclaim memory and all data assigned a priority state of high or normal has been evicted. High priority data is cached longer than normal priority data, with normal priority data being evicted first. Accordingly, important data assigned a priority state of critical, such as the deduplication table, is kept resident in cache memory at the expense of other data, regardless of the frequency or recency of use of the data.

FIG. 1 is an example file system 100 incorporating a deduplication table stored in cache memory. An implementation of the file system 100 comprises a processor 102, a checksum module 104, a deduplication module 106, a level 1 cache 108, a level 2 cache 110, and a disk 112 in communication via a bus 114.

The file system 100 caches data in a hierarchy to optimize performance while reducing monetary cost. The level 1 cache 108 may correspond to any tangible storage medium that stores data, and may be a volatile storage medium such as dynamic random access memory (“DRAM”). Certain data, such as frequently-accessed or recently-accessed data, that speeds up the operation of the processor 102 during read/write operations is stored in the level 1 cache 108. In one implementation, the level 1 cache 108 uses a variant of the Adaptive Replacement Cache (“ARC”) algorithm, as described with respect to FIG. 3. Data allowing for slower access, such as data less frequently or recently used, is stored in the level 2 cache 110 or the disk 112. The level 2 cache 110 and the disk 112 may be persistent non-volatile storage, with the level 2 cache 110 comprising faster memory and/or storage devices relative to the disk 112. In one implementation, the level 2 cache 110 comprises flash memory based solid state disks and the disk 112 comprises hard disk drives. The level 2 cache 110 may be, for example, L2ARC.

The processor 102 issues a write request via the bus 114 to write data to one or more of the level 1 cache 108, the level 2 cache 110, and the disk 112. The checksum module 104 and the deduplication module 106 intercept the data from the write request to perform deduplication operations to eliminate the storage of duplicate copies of data. Deduplication may be implemented at various granularity levels, including file, block, and byte levels. File-level deduplication removes redundancy between files but cannot remove redundancy within a specific file. Block-level deduplication removes redundancy both within a file and between files. For example, most of a virtual machine image file is duplicated data (e.g., a guest operating system) with some blocks of data being unique to each virtual machine. By implementing block-level deduplication, only blocks of data that are unique to each virtual machine image consume additional storage space and all other duplicate blocks of data are shared. Byte-level deduplication, which is the finest granularity, removes duplicate bytes. While file-level deduplication initially has the lowest overhead, if there is any change to a block of data in a file, deduplication operations must be performed again. In other words, if one block of data in a file changes, the two copies of the file are no longer identical. Accordingly, file-level deduplication is not suited for use if the file system 100 manages files, such as virtual machine images, that are substantially identical with a few blocks of data differing. On the other hand, byte-level deduplication is costly because the file system 100 determines where regions of duplicated versus unique data begin and end. Block-level deduplication provides a general deduplication with less overhead than byte-level deduplication. As such, in a particular implementation, the file system 100 performs block-level deduplication. However, other levels of granularity are contemplated.

In one implementation, deduplication is synchronous (i.e., real time or in-line). Specifically, duplicate data is removed as it appears during a write request from the processor 102. The checksum module 104 computes a checksum for a block of data using a hash function that uniquely identifies the block of data with a significantly high level of probability. In one implementation, the checksum module 104 uses a secure hash, such as a 256-bit block checksum function that is cryptographically strong (e.g., SHA256). A secure hash has such a substantially small likelihood of producing the same output given two different inputs that if two blocks of data have the same checksum, there is a substantially high probability that the two blocks of data are the same block. In some implementations, a user may perform a verification to compare each block of data with duplicate blocks to confirm that the blocks are identical. Furthermore, the checksum module 104 may use a weaker hash function in combination with verification operations during deduplication.
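The checksum and optional verification steps described above can be sketched as follows. This is a minimal illustration using Python's standard hashlib module; the function names block_checksum and is_duplicate are hypothetical and do not appear in the disclosure.

```python
import hashlib


def block_checksum(block: bytes) -> bytes:
    """Return a 256-bit (32-byte) SHA-256 checksum for one block of data."""
    return hashlib.sha256(block).digest()


def is_duplicate(new_block: bytes, stored_block: bytes, verify: bool = True) -> bool:
    """Treat two blocks as duplicates when their checksums match.

    When verify is True, the blocks are also compared byte for byte,
    mirroring the optional verification step described in the text.
    """
    if block_checksum(new_block) != block_checksum(stored_block):
        return False
    if verify:
        return new_block == stored_block
    return True
```

With a secure hash, the byte-for-byte comparison is a safeguard rather than a necessity; with a weaker hash function it would be the step that guarantees correctness.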

A deduplication table 116 tracks unique blocks of data by mapping the checksum associated with the block of data to its storage location and reference count. In other words, the deduplication module 106 compares the checksum computed by the checksum module 104 to checksum entries 118 in the deduplication table 116. If the checksum matches one of the checksum entries 118, the block of data is a duplicate of existing data. Rather than allocating additional storage space to the duplicate block, the deduplication module 106 increments a reference count 120 of the existing data, which indicates whether the block of data is highly replicated. For example, if the reference count 120 for a block of data is zero, the block of data is not in use and may be deleted. On the other hand, if the reference count 120 for a block of data is a large number, the block of data is used by multiple users, applications, functions, etc. The write operation is concluded by returning a block pointer 122 referencing a location where the block of data is stored.

If the checksum generated by the checksum module 104 does not match any of the checksum entries 118, the block of data is new unique data. Consequently, the block of data is written to a storage location in the level 1 cache 108, the level 2 cache 110, or the disk 112. The deduplication module 106 creates an entry in the deduplication table 116 corresponding to the new unique data. The checksum of the block of data is added to the checksum entries 118 with a corresponding reference count 120 of one and a block pointer 122 mapping the checksum to the storage location of the block of data. The write operation is concluded by returning the block pointer 122 referencing the location where the block of data is stored.
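The two write paths described above (duplicate block versus new unique block) can be sketched as a small in-memory model. The class names DedupEntry and DedupStore, and the use of a Python list as a stand-in for the storage location, are assumptions made for illustration only.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class DedupEntry:
    block_pointer: int    # where the unique block is stored
    reference_count: int  # how many writes refer to this block


class DedupStore:
    """Hypothetical model of a deduplication table plus backing storage."""

    def __init__(self) -> None:
        self.table: dict[bytes, DedupEntry] = {}  # checksum -> entry
        self.blocks: list[bytes] = []             # stand-in for cache or disk

    def write(self, block: bytes) -> int:
        checksum = hashlib.sha256(block).digest()
        entry = self.table.get(checksum)
        if entry is not None:
            # Duplicate: no new storage is allocated, only the count changes.
            entry.reference_count += 1
            return entry.block_pointer
        # New unique data: store the block and create a table entry for it.
        self.blocks.append(block)
        pointer = len(self.blocks) - 1
        self.table[checksum] = DedupEntry(pointer, 1)
        return pointer
```

Writing the same block twice returns the same pointer and leaves a single stored copy with a reference count of two.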

As shown in FIG. 1, the deduplication table 116 is stored in the level 1 cache 108. This is because deduplication operations are executed faster if the deduplication table 116 is stored in DRAM, slower if stored in the level 2 cache 110, and slower still if stored in the disk 112. As discussed above, during deduplication operations, the deduplication module 106 first reads the entries in the deduplication table 116 to determine if the checksum is present in the checksum entries 118. If the deduplication table 116 is stored in the level 2 cache 110 or the disk 112, the time spent waiting for the deduplication module 106 to read the deduplication table 116 is lengthy. As a result, with the processor 102 issuing many write requests at any given time, system performance would be greatly reduced. Accordingly, to increase efficiency, the deduplication table 116 is stored entirely within the level 1 cache 108.

However, if no restrictions are placed on the ability to deduplicate data, the deduplication table 116 grows unbounded. This is because every unique block of data has an entry in the deduplication table 116. As discussed herein, additional capacity in the level 1 cache 108 is costly and consequently relatively limited. Moreover, if the deduplication table 116 consumes too much space in the level 1 cache 108, it reduces the amount of space available for other data.

The deduplication module 106 ensures that the entire deduplication table 116 is kept resident in the level 1 cache 108 by limiting the size of the deduplication table 116 such that it does not exceed an upper limit that the level 1 cache 108 is configured to support. In one implementation, a user is provided with an option to designate the amount of memory in the level 1 cache 108 allocated to the deduplication table 116. The allocation ensures that there is enough memory in the level 1 cache 108 to store the deduplication table 116, while setting an upper limit on the size of the deduplication table 116. The user will not be able to allocate less memory in the level 1 cache 108 than the size of the deduplication table 116 or more memory than the memory available in the level 1 cache 108.

If the processor 102 issues a write request for a new unique block of data and the deduplication module 106 determines that creating a new entry in the deduplication table 116 will cause the size of the deduplication table 116 to exceed the upper limit, the deduplication module 106 turns off the deduplication of new unique blocks. In other words, the write request will be completed, with the new unique block of data being stored, but an entry corresponding to the new unique data is not created in the deduplication table 116. However, if the processor 102 issues a write request for a duplicate block of data, the deduplication module 106 may still increment the reference count 120 associated with the checksum entry 118 for the block of data. Accordingly, the size of the deduplication table 116 is capped, ensuring that the deduplication table 116 will be stored entirely within the level 1 cache 108.

In one implementation, once the upper limit of the deduplication table 116 is reached or nearly reached, the deduplication module 106 removes entries in the deduplication table 116 to make space available for new entries. For example, if an entry has a reference count 120 of one (1) for a certain length of time, the deduplication module 106 may delete the entry. Evictions may proceed based on the reference count 120 alone or in conjunction with other information, such as time in the level 1 cache 108. If the processor 102 issues a write request for a block of data corresponding to the checksum of the deleted entry, a new entry will be created in the deduplication table 116. As another example, the deduplication module 106 may delete entries in the deduplication table 116 based on user input.
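One possible reading of the removal policy above is sketched here: entries whose reference count has stayed at one for at least a given age are deleted. The dictionary layout, the created_at field, and the function name prune_single_reference_entries are hypothetical; the disclosure does not specify how age is tracked.

```python
def prune_single_reference_entries(entries: dict, now: float, min_age: float) -> list:
    """Delete entries with a reference count of one that are at least min_age old.

    entries maps a checksum to a dict with 'reference_count' and 'created_at'
    keys (an assumed layout). Returns the checksums that were removed.
    """
    stale = [
        checksum
        for checksum, entry in entries.items()
        if entry["reference_count"] == 1 and now - entry["created_at"] >= min_age
    ]
    for checksum in stale:
        del entries[checksum]
    return stale
```

A later write of a block whose entry was pruned is simply treated as new unique data, so a fresh entry is created for it.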

FIG. 2 is a flow chart illustrating example operations 200 for allocating and managing resources for a deduplication table. In one implementation, in response to a write request for a chunk of data (e.g., a file, a block, or a byte), a computing operation 202 computes a checksum for the data using a hash function that uniquely identifies the chunk of data with a significantly high level of probability. A searching operation 204 searches a deduplication table for the checksum. An operation 206 determines whether the checksum matches an entry in the deduplication table.

If the checksum computed at the computing operation 202 matches a checksum entry in the deduplication table, the chunk of data is a duplicate of existing data. Consequently, an incrementing operation 208 increments a reference count associated with the checksum entry in the deduplication table. A returning operation 210 concludes the write request by returning a pointer referencing a location where the existing data is stored.

If the checksum computed at the computing operation 202 does not match any of the checksum entries in the deduplication table, the chunk of data is new unique data. A writing operation 212 writes the data to a storage location. An operation 214 determines whether adding a new entry to the deduplication table will cause the size of the deduplication table to exceed an upper limit, which may be set, for example, by a user.

If adding a new entry to the deduplication table will not cause the size of the deduplication table to exceed the upper limit, an adding operation 216 will create a new entry in the deduplication table corresponding to the new unique data written during the writing operation 212. The returning operation 210 concludes the write request by returning a pointer referencing a location where the new unique data is stored.

If adding a new entry to the deduplication table will cause the size of the deduplication table to exceed the upper limit, deduplication of new unique blocks is turned off and a new entry is prevented from being added to the deduplication table. In other words, after the writing operation 212 writes the new unique data, the returning operation 210 concludes the write request by returning a pointer referencing the location where the new unique data is stored. A new entry corresponding to the new unique data is not created in the deduplication table.
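The operations 202 through 216 can be sketched end to end as follows. As a simplifying assumption, the upper limit is expressed here as a maximum number of entries rather than an amount of memory; the class name CappedDedupTable and the list used as storage are hypothetical.

```python
import hashlib


class CappedDedupTable:
    """Hypothetical sketch of the FIG. 2 write path with a capped table."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries            # assumed form of the upper limit
        self.entries: dict[bytes, list[int]] = {} # checksum -> [pointer, refcount]
        self.storage: list[bytes] = []            # stand-in for a storage location

    def write(self, chunk: bytes) -> int:
        checksum = hashlib.sha256(chunk).digest()       # computing operation 202
        entry = self.entries.get(checksum)              # operations 204 and 206
        if entry is not None:
            entry[1] += 1                               # incrementing operation 208
            return entry[0]                             # returning operation 210
        self.storage.append(chunk)                      # writing operation 212
        pointer = len(self.storage) - 1
        if len(self.entries) + 1 <= self.max_entries:   # operation 214
            self.entries[checksum] = [pointer, 1]       # adding operation 216
        # When the limit would be exceeded, the data is stored without an entry.
        return pointer                                  # returning operation 210
```

Once the table is full, new unique chunks are still written but are no longer deduplicated, while chunks that already have an entry continue to be shared.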

As shown in FIG. 3, in one implementation, the level 1 cache 108 uses a variant of the Adaptive Replacement Cache (“ARC”) algorithm. The level 1 cache 108 maintains a cache directory split into a Most Recently Used (“MRU”) list and a Most Frequently Used (“MFU”) list. The MRU list is divided into two dynamic portions MRU 302 and ghost MRU 304, and the MFU list is divided into two dynamic portions MFU 306 and ghost MFU 308.

The MRU 302 and the MFU 306 are actual cache memory. The MRU 302 maintains a list of recently accessed cache entries, and the MFU 306 maintains a list of frequently accessed cache entries (i.e., entries that are referenced at least twice). The MRU 302 sorts the list based on the time of the most recent access, with new entries or cache hits at the top, pushing entries down until no free space exists in the level 1 cache 108, resulting in the bottom entry being evicted. Similarly, the MFU 306 sorts the list such that frequently accessed entries are at the top of the list, and entries that are accessed less frequently are near the bottom of the list and eventually evicted if no free space exists in the level 1 cache 108. In other words, the MRU 302 and the MFU 306 each have a target size that may be dynamically adjusted as well as a maximum size that is a percentage of the size of the level 1 cache 108. If the MRU 302 and MFU 306 are within these bounds, no evictions occur. If the MRU 302 reaches its target size and the MFU 306 has not reached its target size, the size of the MRU 302 may be increased at the expense of the MFU 306. Similarly, the size of the MFU 306 may be increased at the expense of the MRU 302. However, if the MRU 302 and MFU 306 reach the maximum size such that there is no free space in the level 1 cache 108, the bottom entry is evicted as new entries are added. Other ways of managing the size of each portion of the level 1 cache 108 and/or the eviction of data from the level 1 cache 108 or each portion of the level 1 cache 108 are also possible.

The ghost MRU 304 and the ghost MFU 308 each comprise a list tracking data recently evicted from the MRU 302 and the MFU 306, respectively. The ghost MRU 304 list and the ghost MFU 308 list only contain metadata (references for the evicted entries), not the cache entry itself.

The MRU 302 and the MFU 306 evict data based on recency and frequency of use. Consequently, important data that is not recently or frequently accessed may be evicted from the level 1 cache 108. For example, accesses across the deduplication table 116 tend to be random. As a result, portions of the deduplication table 116 may be evicted from the level 1 cache 108 into the level 2 cache 110 or the disk 112, which reduces efficiency of deduplication operations and overall system performance, as described herein.

To ensure that important data remains in the level 1 cache 108 as data is written to the level 1 cache 108, the important data is assigned a priority state indicating the relative importance of the data. The priority state dictates how long the data is stored in the level 1 cache 108, and consequently, the order in which the data is evicted from the level 1 cache 108.

In one implementation, data is assigned a priority of critical, high, or normal. Further, critical data is stored in a critical state 310 in the level 1 cache 108, high priority data is stored in the MFU 306, and normal data is stored in the MRU 302. Data is evicted from the level 1 cache 108 based on the priority state in conjunction with MRU or MFU processing. In one specific possible arrangement, the level 1 cache 108 evicts data from the MRU 302 first, and once all the data in the MRU 302 is evicted, the level 1 cache 108 evicts data from the MFU 306. In this scheme, data in the critical state 310 is stored in the level 1 cache 108 at the expense of the remaining data in the level 1 cache 108. Specifically, data in the critical state 310 is not evicted from the level 1 cache 108 unless heavy memory pressure causes the level 1 cache 108 to reclaim memory and all data in the MRU 302 and MFU 306 has been evicted. Accordingly, important data assigned a priority state of critical is kept resident in the level 1 cache 108 at the expense of other data, regardless of the frequency or recency of use of the data.

Data assigned a normal priority level is stored in the MRU 302 and evicted in conjunction with MRU processing, and data assigned a high priority level is stored in the MFU 306 and evicted in conjunction with MFU processing. As discussed above, as recently accessed data is added to the MRU 302, the least recently accessed data is evicted if no free space exists in the level 1 cache 108. In one implementation, all data stored in the MRU 302 is assigned the normal priority level. Accordingly, data assigned the normal priority level is cached based on how recently the data was accessed. A new entry for data assigned the normal priority level is added to the top of the recently used list. The data moves down the recently used list unless a cache hit moves the data back to the top of the recently used list. If no free space exists in the level 1 cache 108, the least recently used data is evicted from the MRU 302. For example, if the MFU 306 and/or the critical state 310 reach a size where additional space is needed, the MRU 302 evicts the least recently used data. Similarly, in one implementation, all data stored in the MFU 306 is assigned the high priority level and is cached based on how frequently the data was accessed. Data assigned the high priority level is not evicted from the MFU 306 until data stored in the MRU 302 is evicted, and if there is no free space after the data stored in the MRU 302 is evicted, the least frequently used data is evicted from the MFU 306. As detailed above, data assigned a critical priority level is evicted only after the data in the MRU 302 and the MFU 306 is evicted. In other implementations, the normal, high, and critical priority levels may each comprise sub-priority levels such that data is ranked in the MRU 302, the MFU 306, and the critical state 310 based on the sub-priority levels. In still other implementations, some data is not assigned a priority level and is cached as a lowest priority in the level 1 cache 108. Here, the lack of a priority implies a normal priority. Alternatively, the lack of a priority may imply a critical priority, in which case the data is treated in the manner described relative to critical priority data.
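The eviction order described above (normal data from the MRU first, then high priority data from the MFU, and critical data only as a last resort) can be sketched with a simplified model. This is not the ARC variant itself: capacity is counted in entries, ghost lists and target sizes are omitted, and the class name PriorityCache and its methods are hypothetical.

```python
from collections import OrderedDict

NORMAL, HIGH, CRITICAL = "normal", "high", "critical"


class PriorityCache:
    """Simplified three-state cache: MRU (normal), MFU (high), critical."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.mru = OrderedDict()       # key -> value, least recently used first
        self.mfu = {}                  # key -> [value, hit count]
        self.critical = OrderedDict()  # key -> value, oldest first

    def __len__(self) -> int:
        return len(self.mru) + len(self.mfu) + len(self.critical)

    def _evict_one(self):
        if self.mru:                   # normal priority data goes first
            return self.mru.popitem(last=False)[0]
        if self.mfu:                   # then the least frequently used high data
            victim = min(self.mfu, key=lambda k: self.mfu[k][1])
            del self.mfu[victim]
            return victim
        # Critical data is evicted only when nothing else remains.
        return self.critical.popitem(last=False)[0]

    def put(self, key, value, priority=NORMAL) -> list:
        """Store a value under the given priority; return the evicted keys."""
        for state in (self.mru, self.mfu, self.critical):
            state.pop(key, None)       # drop any earlier copy of this key
        evicted = []
        while len(self) >= self.capacity:
            evicted.append(self._evict_one())
        if priority == CRITICAL:
            self.critical[key] = value
        elif priority == HIGH:
            self.mfu[key] = [value, 1]
        else:
            self.mru[key] = value
        return evicted

    def get(self, key):
        if key in self.mru:
            self.mru.move_to_end(key)  # a cache hit moves the entry to the top
            return self.mru[key]
        if key in self.mfu:
            self.mfu[key][1] += 1
            return self.mfu[key][0]
        return self.critical.get(key)
```

In this model a deduplication table stored with the critical priority outlives every normal and high priority entry, regardless of how rarely it is read.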

The priority level may be determined, for example, based on object type, user settings, or commands received from an application or system. In one implementation, the deduplication table 116 is assigned critical priority. Further, an application may issue a command to assign application data 312 a critical or other priority. Other data 314, including, but not limited to, block allocation maps (data structures tracking storage locations of data blocks that are allocated or free), may be assigned a critical priority.

In one implementation, the critical state 310 cache is not pre-allocated space in the level 1 cache 108 memory. Instead, the critical state 310 cache consumes only as much space as needed until a maximum size is reached. The maximum size of the critical state 310 cache may be a substantial portion of the total level 1 cache 108, with enough remaining memory for the level 1 cache 108 to operate. Thus, as the critical state 310 cache portion grows, the MRU 302 and MFU 306 cache portions shrink, with the cache areas dynamically adjusting to the types of data using the cache. In one implementation, the maximum size of the critical state 310 is greater than the upper limit of the deduplication table 116, ensuring that the entire deduplication table 116 will be assigned a critical priority and stored in the level 1 cache 108.

If the maximum size of the critical state 310 is nearly the size of the level 1 cache 108 (e.g., 15/16 of the level 1 cache 108) and the maximum size of the critical state 310 is reached such that critical data is being evicted, system performance may be significantly impacted. In such cases, feedback may be generated to warn the user about the potential for reduced system performance and suggest options for remedying the problem based on user preferences. For example, the user may add more memory to the level 1 cache 108, remove data, such as a storage pool (e.g., zpool), or restore a server that failed over into another storage pool causing the level 1 cache 108 to be temporarily oversubscribed.

FIG. 4 is a flow chart illustrating example operations 400 for assigning priorities to data stored in a cache. In one implementation, a setting operation 402 sets a maximum size of a critical state. Any data stored in the critical state is kept at the expense of any other data, regardless of the frequency or recency of use of the data stored in the critical state. Specifically, data in the critical state is cached longer and is not evicted unless all other data has already been evicted. In one implementation, the setting operation 402 sets the maximum size of the critical state at the time the file system is booted based on a value set forth in a system configuration file. The setting operation 402 may set the maximum size of the critical state at a substantial portion of the memory, leaving only enough remaining memory to carry out operations. Additionally, the setting operation 402 may prevent an administrator from setting the maximum size below a threshold value. In another implementation, the setting operation 402 sets the maximum size at a designated proportion (e.g., 15/16) of the available memory capacity.

A second setting operation 404 sets an upper limit on a size of a deduplication table. For example, a user may designate an amount of DRAM to allocate to the deduplication table. The upper limit ensures there is enough memory to store the complete deduplication table, while preventing the size of the deduplication table from growing beyond the maximum size of the critical state. In one implementation, the second setting operation 404 prevents a user from allocating less memory than a current size of the deduplication table and from allocating more memory than the maximum size of the critical state set in the setting operation 402.
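The two setting operations can be sketched as a pair of checks. The 15/16 proportion comes from the text; the function names critical_max_size and validate_dedup_limit, and the choice to measure everything in bytes, are assumptions for illustration.

```python
def critical_max_size(cache_bytes: int, numerator: int = 15, denominator: int = 16) -> int:
    """Setting operation 402: maximum critical state size as a proportion of memory."""
    return cache_bytes * numerator // denominator


def validate_dedup_limit(requested: int, current_table_size: int, critical_max: int) -> int:
    """Setting operation 404: accept an upper limit only if it is within bounds."""
    if requested < current_table_size:
        raise ValueError("limit is smaller than the current deduplication table")
    if requested > critical_max:
        raise ValueError("limit exceeds the maximum size of the critical state")
    return requested
```

For a 16 GiB cache this yields a 15 GiB critical maximum, so any deduplication table limit between the current table size and 15 GiB would be accepted.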

A receiving operation 406 receives a write command for a chunk of data, which may be a file, a block, or a byte. An assigning operation 408 assigns a priority level to the chunk of data. In one implementation, the priority level may be critical, high, or normal. The priority state dictates how long the data is cached, and consequently, the order in which the data is evicted. Data assigned a priority state of critical will remain resident in cache memory unless heavy memory pressure causes the system to reclaim memory and all data assigned a priority state of high or normal has been evicted. High priority data is cached longer than normal priority data, with normal priority data being evicted first.

An operation 410 determines whether the critical maximum cache size is reached or exceeded. If the critical maximum has not been reached or exceeded, a writing operation 412 writes the data according to the priority assigned in the assigning operation 408. For example, the writing operation 412 may write data assigned a critical priority to the critical state; the writing operation 412 may write data assigned a high priority to a most frequently used cache directory; and the writing operation 412 may write data assigned a normal priority to a most recently used cache directory.

On the other hand, if the critical maximum has been reached or exceeded, such that critical data is being evicted, system performance may be significantly impacted. In such cases, a generating operation 414 provides feedback to warn the user about the potential for reduced system performance and suggests options for remedying the problem based on user preferences. In some implementations, the generating operation 414 submits a report or issues an alert to the user. The generating operation 414 may suggest, for example, adding more memory, removing data, such as a storage pool (e.g., zpool), or restoring a server that failed over into another storage pool causing the memory, and in some cases the critical state, to be temporarily oversubscribed.
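The decision at operation 410 and the feedback of operation 414 can be sketched as a single check. The function name critical_state_feedback and the shape of the returned dictionary are hypothetical; the suggested remedies are the ones listed in the text.

```python
def critical_state_feedback(critical_bytes: int, critical_max_bytes: int):
    """Return None while under the critical maximum, else a warning with remedies."""
    if critical_bytes < critical_max_bytes:
        return None  # operation 410: maximum not reached, the write proceeds
    return {         # generating operation 414
        "warning": "critical state maximum reached; critical data may be evicted",
        "suggestions": [
            "add more memory",
            "remove data, such as a storage pool",
            "restore a server that failed over into another storage pool",
        ],
    }
```

A caller could surface the returned warning as a report or an alert, depending on user preferences.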

FIGS. 5 and 6 show an example network environment 500 and an example computing system 600, respectively, that may implement various systems and methods of the presently disclosed technology. Referring to FIG. 5, disks 502 and 504 are connected to one or more storage appliances 506, 508, which may be configured according to the systems and methods described herein, for example, with respect to the file system 100 of FIG. 1. One or more clients 510, 512 may have a need for data that is stored on one of the storage appliances 506, 508. The clients 510, 512 may access data from the storage appliances 506, 508 using a network 514.

Referring to FIG. 6, a general purpose computer system 600 is capable of executing a computer program product to execute a computer process. Data and program files may be input to the computer system 600, which reads the files and executes the programs therein. Some of the elements of the general purpose computer system 600 are shown in FIG. 6, wherein a processor 602 is shown having an input/output (I/O) section 604, a Central Processing Unit (CPU) 606, and memory 608.

There may be one or more processors 602, such that the processor 602 of the computer system 600 comprises the CPU 606 or a plurality of processing units, commonly referred to as a parallel processing environment. The computer system 600 may be a conventional computer, a distributed computer, or any other type of computer, such as one or more external computers made available via a network architecture, for example as described with respect to FIG. 5. The presently described technology is optionally implemented in software devices loaded in the memory 608, stored on a configured DVD/CD-ROM 610 or a storage unit 612, and/or communicated via a network link 614, thereby transforming the computer system 600 in FIG. 6 to a special purpose machine for implementing the operations described herein.

The I/O section 604 is connected to one or more user-interface devices (e.g., a keyboard 616 and a display unit 618), the storage unit 612, and a disk drive 620. In one implementation, the disk drive 620 is a DVD/CD-ROM drive unit capable of reading the DVD/CD-ROM 610, which typically contains programs and data 622. In another implementation, the disk drive 620 is a solid state drive unit.

Computer program products containing mechanisms to effectuate the systems and methods in accordance with the presently described technology may reside in the memory 608, on the storage unit 612, on the DVD/CD-ROM 610 of the computer system 600, or on external storage devices made available via a network architecture with such computer program products, including one or more database management products, web server products, application server products, and/or other additional software components. Alternatively, the disk drive 620 may be replaced or supplemented by a floppy drive unit, a tape drive unit, or other storage medium drive unit. The network adapter 624 is capable of connecting the computer system 600 to a network via the network link 614, through which the computer system 600 can receive instructions and data embodied in a carrier wave. An example of such systems is personal computers. It should be understood that computing systems may also embody devices such as Personal Digital Assistants (PDAs), mobile phones, tablets or slates, multimedia consoles, gaming consoles, set top boxes, etc.

When used in a LAN-networking environment, the computer system 600 is connected (by wired connection or wirelessly) to a local network through the network interface or adapter 624, which is one type of communications device. When used in a WAN-networking environment, the computer system 600 typically includes a modem, a network adapter, or any other type of communications device for establishing communications over the wide area network. In a networked environment, program modules depicted relative to the computer system 600, or portions thereof, may be stored in a remote memory storage device. It is appreciated that the network connections shown are examples of communications devices, and that other means of establishing a communications link between the computers may be used.

In an example implementation, deduplication table management and/or data priority assignment software and other modules and services may be embodied by instructions stored on such storage systems and executed by the processor 602. Some or all of the operations described herein may be performed by the processor 602. Further, local computing systems, remote data sources and/or services, and other associated logic represent firmware, hardware, and/or software configured to control data access. Such services may be implemented using a general purpose computer and specialized software (such as a server executing service software), a special purpose computing system and specialized software (such as a mobile device or network appliance executing service software), or other computing configurations. In addition, one or more functionalities of the systems and methods disclosed herein may be generated by the processor 602 and a user may interact with a Graphical User Interface (GUI) using one or more user-interface devices (e.g., the keyboard 616, the display unit 618, and the user devices 604) with some of the data in use directly coming from online sources and data stores.

In the present disclosure, the methods disclosed may be implemented as sets of instructions or software readable by a device. Further, it is understood that the specific order or hierarchy of steps in the methods disclosed are instances of example approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the method can be rearranged while remaining within the disclosed subject matter. The accompanying method claims present elements of the various steps in a sample order, and are not necessarily meant to be limited to the specific order or hierarchy presented.

The described disclosure may be provided as a computer program product, or software, that may include a machine-readable medium having stored thereon instructions, which may be used to program a computer system (or other electronic devices) to perform a process according to the present disclosure. A machine-readable medium includes any mechanism for storing information in a form (e.g., software, processing application) readable by a machine (e.g., a computer). The machine-readable medium may include, but is not limited to, magnetic storage medium (e.g., floppy diskette); optical storage medium (e.g., CD-ROM); magneto-optical storage medium; read only memory (ROM); random access memory (RAM); erasable programmable memory (e.g., EPROM and EEPROM); flash memory; or other types of medium suitable for storing electronic instructions.

The description above includes example systems, methods, techniques, instruction sequences, and/or computer program products that embody techniques of the present disclosure. However, it is understood that the described disclosure may be practiced without these specific details.

It is believed that the present disclosure and many of its attendant advantages will be understood by the foregoing description, and it will be apparent that various changes may be made in the form, construction and arrangement of the components without departing from the disclosed subject matter or without sacrificing all of its material advantages. The form described is merely explanatory, and it is the intention of the following claims to encompass and include such changes.

While the present disclosure has been described with reference to various embodiments, it will be understood that these embodiments are illustrative and that the scope of the disclosure is not limited to them. Many variations, modifications, additions, and improvements are possible. More generally, embodiments in accordance with the present disclosure have been described in the context of particular implementations. Functionality may be separated or combined in blocks differently in various embodiments of the disclosure or described with different terminology. These and other variations, modifications, additions, and improvements may fall within the scope of the disclosure as defined in the claims that follow.

What is claimed is:
 1. A method comprising: receiving a write request to write data to a cache having one or more states; assigning a priority level to the data using at least one processor, the priority level and a cache replacement policy dictating an order in which the data is evicted from the cache; and writing the data to the one or more states in the cache based on the assigned priority level.
 2. The method of claim 1, wherein the one or more states includes a first state, a second state, and a third state, and data assigned a first priority level is stored in the first state, data assigned a second priority level is stored in the second state, and data assigned a third priority level is stored in the third state.
 3. The method of claim 2, wherein the first priority level is critical, the second priority level is high, and the third priority level is normal.
 4. The method of claim 2, wherein the data assigned the first priority level is evicted from the first state only after both the data assigned the second priority level and the data assigned the third priority level is evicted.
 5. The method of claim 4, wherein a deduplication table is assigned the first priority level to ensure that the deduplication table is accessible from the cache.
 6. The method of claim 2, wherein data assigned the second priority level is evicted from the second state after data assigned the first priority level is evicted from the first state.
 7. The method of claim 2, wherein the cache replacement policy comprises maintaining a first list in the second state and a second list in the third state, the first list being a most recently used list having a most recently used position and a least recently used position and the second list being a most frequently used list having a most frequently used position and a least frequently used position, data in the least recently used position being evicted when there is no available space in the cache and data is added to the most recently used position and data in the least frequently used position being evicted when there is no available space in the cache and data is added to the most frequently used position.
 8. The method of claim 2, wherein a size of the first state continues to grow without evicting data assigned the first priority level until a maximum size of the first state is reached.
 9. The method of claim 8, wherein the maximum size is greater than an upper limit of cache allocated to a deduplication table and less than a maximum capacity of the cache.
 10. One or more tangible computer-readable storage media storing computer-executable instructions for performing a computer process on a computing system, the computer process comprising: receiving a write request to write data to a cache having one or more states; assigning a priority level to the data, the priority level and a cache replacement policy dictating an order in which the data is evicted from the cache; and writing the data to the one or more states in the cache based on the assigned priority level.
 11. The one or more tangible computer-readable storage media of claim 10, wherein the one or more states includes a first state, a second state, and a third state, and data assigned a first priority level is stored in the first state, data assigned a second priority level is stored in the second state, and data assigned a third priority level is stored in the third state.
 12. The one or more tangible computer-readable storage media of claim 11, wherein the first priority level is critical, the second priority level is high, and the third priority level is normal.
 13. The one or more tangible computer-readable storage media of claim 11, wherein the data assigned the first priority level is evicted from the first state only after both the data assigned the second priority level and the data assigned the third priority level is evicted.
 14. The one or more tangible computer-readable storage media of claim 13, wherein a deduplication table is assigned the first priority level to ensure that the deduplication table is accessible from the cache.
 15. A system comprising: a cache having one or more states storing data based on a priority level assigned to the data, the priority level and a cache replacement policy dictating an order in which the data is evicted from the cache.
 16. The system of claim 15, wherein the one or more states includes a first state, a second state, and a third state, and data assigned a first priority level is stored in the first state, data assigned a second priority level is stored in the second state, and data assigned a third priority level is stored in the third state.
 17. The system of claim 16, wherein the first priority level is critical, the second priority level is high, and the third priority level is normal.
 18. The system of claim 16, wherein the data assigned the first priority level is evicted from the first state only after both the data assigned the second priority level and the data assigned the third priority level is evicted.
 19. The system of claim 16, wherein a size of the first state continues to grow without evicting data assigned the first priority level until a maximum size of the first state is reached.
 20. The system of claim 19, wherein the maximum size is greater than an upper limit of cache allocated to a deduplication table and less than a maximum capacity of the cache.